( START )



LET S BE A SET OF POINTS AND k BE THE NUMBER OF CLUSTERS



DRAW A SAMPLE R FROM S 



ENUMERATE OVER EVERY SIGNATURE q OF k DISJOINT CONJUNCTIONS



PARTITION R INTO BUCKETS WITH POINTS x AND y IN THE SAME 
BUCKET IF THEY AGREE ON ALL LITERALS OF q 




DISCARD PRESENT SIGNATURE AND PROCEED TO NEXT SIGNATURE



LET B_1, ..., B_k BE THE BUCKETS INDUCED BY SIGNATURE q



FOR EACH BUCKET B_j, LET t_j BE THE MOST SPECIFIC
TERM SATISFIED BY ALL EXAMPLES IN B_j



LET C_q, THE CLUSTERING INDUCED BY SIGNATURE q, BE THE
COLLECTION OF ALL THE TERMS t_j



COMPUTE THE EMPIRICAL FREQUENCY R(t_j) OF EACH TERM t_j



LET THE SUM OF THE R(t_j) BE THE QUALITY Q(C_q, R) OF THE CLUSTERING C_q



OUTPUT THE CLUSTERING C_q ASSOCIATED WITH THE RESPECTIVE
SIGNATURE q FOR WHICH THE COMPUTED ESTIMATE Q(C_q, R) IS
MAXIMIZED



( END ) 
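
The bucketing, most-specific-term, and empirical-frequency steps of the flowchart can be illustrated with a minimal Python sketch. It rests on assumptions that the flowchart does not state: points are tuples of attribute values, a signature is represented simply as a tuple of attribute indices, and a term is a mapping from attribute index to required value. The function names are hypothetical, and the sketch does not cover how signatures are enumerated or when a signature is discarded.

```python
from collections import defaultdict


def partition_by_signature(sample, signature):
    """Group points that agree on every attribute index listed in `signature`.

    Assumption: a signature is modelled as a tuple of attribute indices.
    """
    buckets = defaultdict(list)
    for point in sample:
        key = tuple(point[i] for i in signature)
        buckets[key].append(point)
    return list(buckets.values())


def most_specific_term(bucket):
    """Return the literals {index: value} on which all points in the bucket agree."""
    first = bucket[0]
    return {i: v for i, v in enumerate(first) if all(p[i] == v for p in bucket)}


def satisfies(point, term):
    """True if the point matches every literal of the term."""
    return all(point[i] == v for i, v in term.items())


def empirical_frequency(term, points):
    """Fraction of `points` that satisfy `term`."""
    return sum(satisfies(p, term) for p in points) / len(points)


# Toy sample R of four Boolean points, bucketed on attribute 0 only.
sample = [(1, 0, 1), (1, 0, 0), (0, 1, 1), (0, 1, 1)]
buckets = partition_by_signature(sample, signature=(0,))
terms = [most_specific_term(b) for b in buckets]
frequencies = [empirical_frequency(t, sample) for t in terms]
```

In this toy run the two buckets yield one term each, and each term keeps only the attributes on which its whole bucket agrees, so a bucket with more varied points produces a shorter (less specific) term.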